refactor(store,cache)!: optimize cache restore as O(1)#3134
refactor(store,cache)!: optimize cache restore as O(1)#3134julienrbrt merged 13 commits intomainfrom
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds a persisted in-flight DA snapshot mechanism to the block cache, extends cache manager APIs for height-based DA lookups, adds restart fallback logic, removes store prefix-metadata querying (and related types/tests), and expands tests covering snapshot restore and restart scenarios. Changes
Sequence Diagram(s)sequenceDiagram
participant Node
participant CacheManager
participant Cache
participant Store
participant DARetriever
Node->>CacheManager: RestoreFromStore()
activate CacheManager
CacheManager->>Store: Read snapshotKey
Store-->>CacheManager: snapshot (or empty)
alt snapshot present
CacheManager->>Cache: Restore in-flight height→DA entries (placeholders)
end
CacheManager->>Store: Read finalized-tip metadata (HeightToDAHeight)
Store-->>CacheManager: max DA height metadata
CacheManager->>Cache: Initialize maxDAHeight
deactivate CacheManager
DARetriever->>CacheManager: setNodeHeightToDAHeight(height, hash)
activate CacheManager
alt hash present in cache
CacheManager->>Cache: lookup by hash → daHeight
else hash missing (post-restart)
CacheManager->>Cache: HeightPlaceholderKey lookup → placeholder daHeight
end
CacheManager->>Store: Persist HeightToDAHeight metadata
deactivate CacheManager
Node->>CacheManager: SaveToStore()
activate CacheManager
CacheManager->>Cache: Encode current snapshot
Cache-->>CacheManager: snapshot bytes
CacheManager->>Store: Write snapshotKey
deactivate CacheManager
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).
|
|
Claude finished @julienrbrt's task in 2m 15s —— View job Code Review:
|
| # | File | Line | Severity |
|---|---|---|---|
| 1 | generic_cache.go |
245 | |
| 2 | generic_cache.go |
154,161,204 | ℹ️ Low — store errors on mutation silently dropped |
| 3 | manager_test.go |
117 | 🔧 Lint — unused function parameters |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3134 +/- ##
==========================================
+ Coverage 61.00% 61.12% +0.12%
==========================================
Files 113 113
Lines 11700 11676 -24
==========================================
Hits 7137 7137
+ Misses 3753 3736 -17
+ Partials 810 803 -7
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
block/internal/cache/manager_test.go (2)
459-462: The O(1) restore claim is documented but not asserted.Line 459-Line 462 describe a single metadata read, but this test currently only validates restored state. Consider an instrumented/mock store to assert metadata-read count during
NewManagerso this performance contract does not regress.As per coding guidelines, "Mock external dependencies using mockery".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@block/internal/cache/manager_test.go` around lines 459 - 462, The test claims O(1) restore but doesn't assert metadata read count; update the test around the NewManager call to use a mock store (generated with mockery for the store interface) that tracks GetMetadata/Get calls and assert exactly one metadata read during NewManager(cfg, st, zerolog.Nop()). Specifically, replace the plain store instance with the mock store, set expectations/return values for the metadata read methods used by Manager (e.g., GetMetadata/GetMetadataKey/Get), call SaveToStore beforehand to create the snapshot, then create m2 via NewManager and assert the mock's metadata-read counter or mock expectation was hit exactly once to enforce the O(1) contract.
117-125: Remove unused placeholder parameters fromwriteHeightDAMeta.The two trailing string parameters are never used and are always passed as empty literals. Removing them keeps the helper function focused and aligns with Go best practices for meaningful parameter names.
♻️ Suggested cleanup
- writeHeightDAMeta := func(height, daH uint64, _, _ string) { + writeHeightDAMeta := func(height, daH uint64) { b := make([]byte, 8) binary.LittleEndian.PutUint64(b, daH) require.NoError(t, st.SetMetadata(ctx, pkgstore.GetHeightToDAHeightHeaderKey(height), b)) binary.LittleEndian.PutUint64(b, daH) require.NoError(t, st.SetMetadata(ctx, pkgstore.GetHeightToDAHeightDataKey(height), b)) } - writeHeightDAMeta(1, 100, "", "") - writeHeightDAMeta(2, 101, "", "") + writeHeightDAMeta(1, 100) + writeHeightDAMeta(2, 101)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@block/internal/cache/manager_test.go` around lines 117 - 125, The helper function writeHeightDAMeta includes two unused trailing string parameters; remove those parameters from its signature and all calls to it (change writeHeightDAMeta := func(height, daH uint64, _, _ string) { ... } to writeHeightDAMeta := func(height, daH uint64) { ... } and update calls writeHeightDAMeta(1, 100, "", "") and writeHeightDAMeta(2, 101, "", "") to writeHeightDAMeta(1, 100) and writeHeightDAMeta(2, 101)); keep the body unchanged (it should still call st.SetMetadata with pkgstore.GetHeightToDAHeightHeaderKey and pkgstore.GetHeightToDAHeightDataKey) so tests compile and behavior is preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@block/internal/cache/generic_cache_test.go`:
- Around line 320-332: TestCache_ClearFromStore is asserting removal of
"clear-test/hash1" but the cache only writes the snapshot key
"clear-test/__snap", so the test can pass while ClearFromStore is broken; update
the test to check for removal of the snapshot key instead. Specifically, in
TestCache_ClearFromStore (which uses testMemStore, NewCache[testItem],
c.setDAIncluded and c.ClearFromStore(ctx)), replace the GetMetadata assertion
for "clear-test/hash1" with an assertion that "clear-test/__snap" is missing (or
verify st.GetMetadata returns an error for "clear-test/__snap"), ensuring the
test fails if ClearFromStore does not delete the snapshot key.
In `@block/internal/cache/generic_cache.go`:
- Around line 228-229: The calls to the backing store (e.g.,
c.store.SetMetadata, c.store.Set, c.store.Clear) in SaveToStore and
ClearFromStore currently ignore errors and always return nil; change those sites
to capture the returned error, return or wrap it from the enclosing methods
(SaveToStore, ClearFromStore) so callers see persistence failures, and ensure
snapshotKey()/encodeSnapshot(entries) invocation errors (if any) are also
surfaced; update all instances noted (the SetMetadata call around
snapshotKey/encodeSnapshot and the Set/Clear calls in the blocks corresponding
to lines ~297-303 and ~306-311) to check err != nil and return that error (or a
wrapped error with context).
- Around line 266-269: The restore path currently treats any error from
c.store.GetMetadata(ctx, c.snapshotKey()) as a benign "nothing to restore";
change this to only ignore a not-found error and propagate all other errors with
context. Update the code around c.store.GetMetadata(...) (in the restore logic
that calls c.snapshotKey()) to check the returned error type/value (e.g.,
storage.ErrNotFound or equivalent) and return nil for that case only; for any
other err, wrap it with a contextual message (mentioning snapshotKey or
GetMetadata) and return it so real storage/read failures are not suppressed.
---
Nitpick comments:
In `@block/internal/cache/manager_test.go`:
- Around line 459-462: The test claims O(1) restore but doesn't assert metadata
read count; update the test around the NewManager call to use a mock store
(generated with mockery for the store interface) that tracks GetMetadata/Get
calls and assert exactly one metadata read during NewManager(cfg, st,
zerolog.Nop()). Specifically, replace the plain store instance with the mock
store, set expectations/return values for the metadata read methods used by
Manager (e.g., GetMetadata/GetMetadataKey/Get), call SaveToStore beforehand to
create the snapshot, then create m2 via NewManager and assert the mock's
metadata-read counter or mock expectation was hit exactly once to enforce the
O(1) contract.
- Around line 117-125: The helper function writeHeightDAMeta includes two unused
trailing string parameters; remove those parameters from its signature and all
calls to it (change writeHeightDAMeta := func(height, daH uint64, _, _ string) {
... } to writeHeightDAMeta := func(height, daH uint64) { ... } and update calls
writeHeightDAMeta(1, 100, "", "") and writeHeightDAMeta(2, 101, "", "") to
writeHeightDAMeta(1, 100) and writeHeightDAMeta(2, 101)); keep the body
unchanged (it should still call st.SetMetadata with
pkgstore.GetHeightToDAHeightHeaderKey and pkgstore.GetHeightToDAHeightDataKey)
so tests compile and behavior is preserved.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 65ba1b1a-6fb8-4530-8f80-54d0c766c4ab
📒 Files selected for processing (14)
CHANGELOG.mdblock/internal/cache/generic_cache.goblock/internal/cache/generic_cache_test.goblock/internal/cache/manager.goblock/internal/cache/manager_test.goblock/internal/submitting/submitter.goblock/internal/submitting/submitter_test.gonode/full.gopkg/store/store.gopkg/store/store_test.gopkg/store/tracing.gopkg/store/tracing_test.gopkg/store/types.gotest/mocks/store.go
💤 Files with no reviewable changes (5)
- pkg/store/store_test.go
- pkg/store/tracing.go
- pkg/store/types.go
- pkg/store/store.go
- test/mocks/store.go
tac0turtle
left a comment
There was a problem hiding this comment.
nice job, love seeing the store cleanup
There was a problem hiding this comment.
🧹 Nitpick comments (1)
block/internal/cache/generic_cache.go (1)
245-258: Silent nil return for malformed snapshot data masks corruption.
decodeSnapshotreturnsnilwhen the buffer length isn't a multiple ofsnapshotEntrySize. This silently treats corrupted data the same as an empty snapshot, making data corruption hard to detect.Consider returning an error for the malformed case (non-empty but invalid length) to surface corruption during debugging.
♻️ Suggested improvement
-// decodeSnapshot deserialises a byte slice produced by encodeSnapshot. -// Returns nil if the slice is empty or not a multiple of snapshotEntrySize. -func decodeSnapshot(buf []byte) []snapshotEntry { - if len(buf) == 0 || len(buf)%snapshotEntrySize != 0 { - return nil +// decodeSnapshot deserialises a byte slice produced by encodeSnapshot. +// Returns nil, nil for empty buffer; returns error if buffer is malformed. +func decodeSnapshot(buf []byte) ([]snapshotEntry, error) { + if len(buf) == 0 { + return nil, nil + } + if len(buf)%snapshotEntrySize != 0 { + return nil, fmt.Errorf("malformed snapshot: length %d not a multiple of %d", len(buf), snapshotEntrySize) } entries := make([]snapshotEntry, len(buf)/snapshotEntrySize) for i := range entries { off := i * snapshotEntrySize entries[i].blockHeight = binary.LittleEndian.Uint64(buf[off:]) entries[i].daHeight = binary.LittleEndian.Uint64(buf[off+8:]) } - return entries + return entries, nil }This would require updating
RestoreFromStoreto handle the error.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@block/internal/cache/generic_cache.go` around lines 245 - 258, decodeSnapshot currently returns nil for malformed buffers which hides corruption; change its signature to return ([]snapshotEntry, error) and have it return an explicit error when buf is non-empty but len(buf)%snapshotEntrySize != 0, keeping the successful path returning entries and nil error; update any callers (notably RestoreFromStore) to handle the error (propagate or surface it) and adjust any tests/type usages accordingly; reference symbols: decodeSnapshot, snapshotEntry, snapshotEntrySize, and RestoreFromStore.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@block/internal/cache/generic_cache.go`:
- Around line 245-258: decodeSnapshot currently returns nil for malformed
buffers which hides corruption; change its signature to return ([]snapshotEntry,
error) and have it return an explicit error when buf is non-empty but
len(buf)%snapshotEntrySize != 0, keeping the successful path returning entries
and nil error; update any callers (notably RestoreFromStore) to handle the error
(propagate or surface it) and adjust any tests/type usages accordingly;
reference symbols: decodeSnapshot, snapshotEntry, snapshotEntrySize, and
RestoreFromStore.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: aa4d0446-cbc3-44ac-b69c-2d22a691f033
📒 Files selected for processing (4)
block/internal/cache/generic_cache.goblock/internal/cache/generic_cache_test.goblock/internal/executing/executor.gopkg/sequencers/single/sequencer_test.go
| entries, err := c.store.GetMetadataByPrefix(ctx, c.storeKeyPrefix) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to query metadata by prefix %q: %w", c.storeKeyPrefix, err) | ||
| return |
There was a problem hiding this comment.
nit: store and storeKeyPrefix must not be empty in normal use. why not move this guard to the constructor to fail fast?
| daHeightForHeader, ok := s.cache.GetHeaderDAIncluded(headerHash.String()) | ||
| if !ok { | ||
| return fmt.Errorf("header hash %s not found in cache", headerHash) | ||
| daHeightForHeader, ok = s.cache.GetHeaderDAIncludedByHeight(height) |
There was a problem hiding this comment.
here and L 458: why use this as fallback instead of default? you would save the headerHash time.
There was a problem hiding this comment.
good point, i will simplify
* main: chore: minor deduplication (#3139) refactor(store,cache)!: optimize cache restore as O(1) (#3134) ci: remove spamoor results from benchmark results per PR (#3138) build(deps): Bump rollup from 4.22.4 to 4.59.0 in /docs in the npm_and_yarn group across 1 directory (#3136) fix(block): fix blocktime timer usage (#3137) build(deps): bump core v1.0.0 (#3135)
Overview
Optimize cache restore as O(n) + prep v1.0.0.
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Chores